Abstract



Although gender differences are fairly consistent when men and women report their general 
confidence, much less is known about the existence of such differences when subjects are 
asked to assess the degree of confidence they have in their ability to answer any particular 
test or exam question. The objective of this research was to investigate gender differences 
in item-specific confidence judgements. Data were collected from three different 
psychology courses enrolling 70 men and 181 women. After answering each item on 
course exams, students indicated their confidence that their answer to that item was correct. 
Results showed that gender differences in confidence are dependent on the context 
(whether answers were correct or wrong) and on the domain being tested. In addition, while 
both men and women were overconfident, undergraduate males were especially 
overconfident (and inappropriately so) when incorrect.
Highly Confident, but Wrong: Gender Differences 
and Similarities in Confidence Judgments 

Lack of confidence has frequently been cited as a reason inhibiting the persistence of 
women in higher education and certain professions, i.e., science and engineering (Dix, 
1987). Studies using general measures of confidence, such as grade prediction or potential 
ability to pass a test, have found that women are less confident than men in their abilities in 
mathematics, problem solving, and science (Campbell & Hackett, 1986; Hornig, 1987; 
Johnson, 1989; Matyas, 1984). This finding of lesser confidence in females has been 
observed at the sixth grade level (Fennema & Sherman, 1978), the junior and senior high 
level (Rosen & Aneshel, 1978), and the undergraduate and graduate levels (Dix, 1987). 
However, lack of confidence is not necessarily indicative of low ability. Even when female 
students achieve as well as or better than their male counterparts, they tend to underestimate 
themselves (Fennema & Sherman, 1978; Zukerman, 1987). Moreover, this general lack of 
confidence does not end with graduation from the academy. Successful professional 
women may also underestimate their abilities and overestimate others' abilities, a tendency 
Clance & O'Toole (1988) labeled the "Imposter Phenomenon." 

Although gender differences are fairly consistent when men and women report their 
general confidence, much less is known about the existence of such differences when 
subjects are asked to assess the degree of confidence they have in their ability to answer 
any particular test or exam question. This kind of item-specific confidence is usually 
discussed in the cognitive psychology literature under the headings of metamemory, 
metacognition, comprehension monitoring, and feeling-of-knowing (e.g., Epstein, 
Glenberg & Bradley, 1984; Glenberg & Epstein, 1987; Maki & Berry, 1984; Pressley, 
Ghatala, Woloshyn & Pirie, 1990). One might expect (and hope) that students would 
express high confidence in items they knew the answer to, and low confidence in items 
they didn't know the answer to; that is, they would be able to distinguish between what 
they know and what they do not know in their confidence judgements. Interestingly, this 
research has pointed out that people often are unaware of wrong answers, and/or that they 
are usually overconfident in their estimated knowledge (e.g., Lichtenstein & Fischhoff, 
1981; Pressley, Ghatala, Woloshyn & Pirie, 1990). 

To date, few researchers have examined gender differences in item-specific confidence 
judgements. In one of the rare studies of this question, Lichtenstein & Fischhoff (1981) 
failed to find gender differences in young adults' calibration of confidence for general 
world knowledge. In contrast, Jones & Jones (1989) found that the type of item and the 
achievement level of students resulted in gender differences in confidence judgements. 
They asked subjects to decide whether or not they would get the answer right if they 
attempted four questions: two science questions and two mathematics questions (in each 
content area, one question was familiar to the students and one question was unfamiliar). 
Jones and Jones (1989) reported interactions between the ability level of the subject (high 
or low) and the type of question (science, math, familiar, unfamiliar). Overall, females were 
less confident on science questions and the unfamiliar math problem-solving question, but 
more confident than males on the familiar (computational) mathematics problem. The high 
ability females were less confident than both groups of males (and ironically, low ability 
females) on the science and mathematics problem solving questions, but more confident 
than those groups on the mathematics computation question. The authors concluded that 
high ability female students lacked confidence in their ability to answer novel questions. 
However, a potential problem is that Jones & Jones (1989) used only four items, a very 
limited sample. 

Jones & Jones's (1989) work is valuable in stimulating researchers to further examine the 
contexts under which gender differences in item-specific confidence might occur. 
Item-specific confidence can be measured either before a subject has attempted to answer 
an item (as done by Jones & Jones, 1989), or after a subject has answered the item. The 
general literature in comprehension monitoring suggests that subjects are much better at 
estimating their confidence accurately after they have answered an item than prospectively 
(e.g., Glenberg & Epstein, 1987). 

In the present research, we examine gender differences in calibration of confidence on a 
much larger number of items than previously studied, with subjects who answer each item 
before they estimate their confidence, in different courses, and with different achievement 
levels of students. It is important to note that previous studies were conducted in an 
experimental context, whereas this research is conducted within the context of actual 
courses. 

The basic objective of our research is to investigate potential gender differences in 
confidence judgements for material studied in college coursework. Two questions of 
special interest guiding the research are: 1) Are men more confident than women that their 
answers to exam questions are correct?; and 2) Are men better calibrated in confidence than 
women; that is, do higher confidence ratings indicate appropriate accuracy and lower 
ratings inaccuracy? Most simply, do the students know what they know, and what they do 
not know? 

To further clarify the potential role of gender differences in confidence judgements, we 
asked three follow-up questions: 1) Does content of an item (e.g., math, science, 
psychology) affect gender differences in expressed confidence?; 2) Are students in the top 
quartile more confident than bottom quartile students?; and 3) Are graduate students more 
confident than undergraduate students? 

Method 

Subjects. Data were collected from three different psychology courses: a two-quarter 
lower division laboratory methods sequence for Psychology majors (Lab 1 and Lab 2), and 
an upper-division undergraduate/graduate course in Human Learning and Memory 
(Memory). Of the 251 students enrolled in these courses, 70 were men and 181 were 
women: Lab 1 included 23 men and 74 women, Lab 2 included 25 men and 69 women, 
and Learning and Memory had 22 men and 38 women. 

Procedure. In all three courses students took a pretest and final exam. After answering 
each item, students were asked to indicate their confidence that their answer to that item 
was correct. All students were told that their confidence judgements would have absolutely 
no bearing on their grade, and they were urged to give candid responses. In the Lab 
courses, students rated their confidence on a 5 point scale, with 1 = pure guess, 3 = mixed 
feelings of confidence and uncertainty, and 5 = very certain. In the Memory course, 
students wrote their subjective confidence estimates on a numerical scale ranging from 
50-100% for true-false items and 25-100% for multiple-choice items. In addition to the 
pretest, students in the Memory course had six quizzes in which they gave confidence 
estimates. 
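
The lower bounds of the Memory course scales correspond to chance performance: 50% for a two-alternative true-false item and 25% for a multiple-choice item. The sketch below illustrates that correspondence. It assumes four answer options per multiple-choice item (inferred from the 25% floor, not stated in the text), and the function names are hypothetical.

```python
def chance_floor(n_options: int) -> float:
    """Percent correct expected from pure guessing among n_options choices."""
    return 100.0 / n_options


# Assumption: multiple-choice items offered four options (implied by the 25% floor).
SCALE_FLOOR = {
    "true-false": chance_floor(2),
    "multiple-choice": chance_floor(4),
}


def is_valid_estimate(item_type: str, estimate: float) -> bool:
    """Check that a confidence estimate lies between chance and certainty."""
    return SCALE_FLOOR[item_type] <= estimate <= 100.0
```

Under this reading, an estimate at the floor of the scale plays the same role as "1 = pure guess" on the 5 point scale used in the Lab courses.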

Tests. The same pretest was used in both lab courses, consisting of 38 multiple-choice 
items measuring general science background, computational skills, experimental design, 
descriptive statistics and conceptual content (auditory psychophysics). The final exams in 
the Lab courses had 27 multiple-choice items (Lab 1) and 23 multiple-choice items (Lab 2) 
measuring the same areas as the pretest, with the exception of general science questions. 
The pretest in the Memory course had 25 true-false and multiple-choice items; the objective 
portion of the final consisted of 23 true-false and 17 multiple-choice items. Both the 
Memory pretest and the final exam measured subject matter knowledge in the area of 
learning and memory. 

Results 

The primary finding of the present research is that gender differences in confidence are 
dependent on whether subjects were correct or incorrect in their answers and on the domain 
being tested. While most students were overconfident, they did adjust to some degree their 
confidence according to the accuracy of their answers. Women, however, showed more 
accurate perceptions of their potentially incorrect answers than did men, who tended to 
show inappropriately high degrees of confidence when wrong. This finding was 
particularly true of undergraduate males, who were especially overconfident (again, 
inappropriately so) when they were incorrect. 

Similarities and Differences in Confidence 

Table 1 compares the confidence of women and men when they answer correctly and 
when they are wrong. 

Insert Table 1 about here 

When we compare men's mean level of confidence when correct and wrong with women's 
mean levels of confidence, we find course differences. In the lab 
courses, men's level of confidence (both when correct and when wrong) was slightly 
higher than women's confidence. On lab post-tests these differences, although numerically 
small, were significant (t(26) = 4.63, p < .001 for confidence when correct in Lab 1, and 
t(26) = 3.45, p < .002 for confidence when wrong in Lab 1; t(20) = 4.7, p < .001 for 
confidence when correct in Lab 2, and t(20) = 4.45, p < .001 for confidence when wrong 
in Lab 2). However, in the upper division memory course, we found no gender 
differences in overall confidence. Both women and men were correct 69% of the time on 
the 40 items on the multiple-choice and true/false portion of the final exam, and both 
overestimated their likelihood of being correct. 

Calibration. In general, both women and men showed a moderate degree of 
calibration in their confidence ratings (confidence when answers were correct being higher 
than confidence when wrong). There is little evidence of gender differences in overall 
calibration, except that women in Lab 2 are somewhat better at calibration on post-tests than 
are men. On two types of items, experimental design items and statistical items, women's 
confidence when correct significantly differed from their confidence when wrong, t(5) 
= 3.46, p < .018, and t(3) = 4.9, p < .01, respectively; men's confidence in these areas did 
not differ significantly. 
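
The calibration comparisons reported here contrast confidence when correct with confidence when wrong. One way such a comparison could be computed is a paired t statistic across items. This is only a sketch: it assumes the item is the unit of analysis (suggested by the reported degrees of freedom, but not stated in the text), and the function name and the ratings below are hypothetical, not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(conf_correct, conf_wrong):
    """Paired t statistic and degrees of freedom for item-level confidence.

    conf_correct[i] and conf_wrong[i] are the mean confidence ratings on
    item i among students who answered it correctly and incorrectly.
    """
    if len(conf_correct) != len(conf_wrong):
        raise ValueError("both lists must cover the same items")
    diffs = [c - w for c, w in zip(conf_correct, conf_wrong)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical ratings on six items (5 point scale), for illustration only.
correct = [4.2, 4.0, 4.5, 3.9, 4.3, 4.1]
wrong = [3.6, 3.8, 3.7, 3.5, 3.9, 3.4]
t_stat, df = paired_t(correct, wrong)
print(f"t({df}) = {t_stat:.2f}")
```

A large positive t indicates that confidence was reliably higher on correct than on incorrect answers, which is the sense of "calibration" used in this section.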
Domain-specific Gender Differences 

Confidence. Aggregating level of confidence across an entire test may obscure 
gender differences and similarities which may become evident when tests are broken down 
into specific item content groupings. The Memory course exams did not have component 
parts: all of the items tested psychological content. However, the items in the Lab 1 and 
Lab 2 tests comprised the following components: science (primarily auditory 
psychophysics); mathematics (mostly computation); experimental design; and statistics. 
Table 2 presents the mean confidence levels for these four specific kinds of items when 
subjects answered correctly, and when they answered incorrectly on the Lab 1 and Lab 2 
post-tests. 



Insert Table 2 about here 



Of the 16 possible instances where gender differences might be obtained (in the two Lab 
courses and four content domains, for correct and for incorrect answers), there were 
significant gender differences in 9. Of these 
nine, two-thirds occur in cases where subjects answered incorrectly (e.g., in statistics, 
when men were wrong, they were more inappropriately overconfident than women who 
were wrong [Lab 1: t(3) = 4.32, p < .02; Lab 2: t(3) = 4.67, p < .018]). Significant 
differences in confidence both when correct and when incorrect, and for both courses, were 
evident in only one domain: items assessing computational skills (t(6) = 3.35, p < .02 
when correct for computational items in Lab 1; t(6) = 2.91, p < .03 when incorrect for 
computational items in Lab 1; t(5) = 3.7, p < .01 when correct for computational items in 
Lab 2; t(5) = 2.5, p < .05 when incorrect for computational items in Lab 2). However, 
men's greater confidence level was not evident consistently across courses and contexts in 
items assessing science, knowledge of experimental design, or simple 
descriptive statistics. Moreover, where gender differences in confidence occur, they occur 
primarily when subjects answer items incorrectly. 

Calibration. In both lab courses, calibration of confidence is dependent both on gender 
and on the domain-specific nature of the items. Calibration here is operationally defined as 
significant differences between confidence when correct and confidence when wrong. 
Women calibrated their confidence 75% of the time; men calibrated their confidence half of 
the time. Both women and men showed significant differences in the calibration of science 
(auditory psychophysics) items [Lab 1: t(9) = 3.7, p < .004 for women; t(9) = 3.26, p < 
.01 for men; Lab 2: t(9) = 3.9, p < .027 for women; t(9) = 3.16, p < .051 for men]. On 
the statistics items, men showed no calibration, whereas women showed good calibration 
only in Lab 2 (t(3) = 4.9, p < .01). On experimental design items, men showed good 
calibration in Lab 1 (t(5) = 3.6, p < .015) but not Lab 2, whereas women showed good 
calibration in Lab 2 (t(5) = 3.46, p < .018) but not Lab 1. On the items testing 
computational skills, women calibrated their confidence well in both courses; men 
calibrated their confidence well only in Lab 2. Figure 1 illustrates confidence when correct 
and confidence when wrong for women and men answering the math computational skills 
items in Lab 1. 
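
The 75% and 50% figures can be recovered by tallying the eight course-by-domain cases described in this paragraph. The sketch below encodes each case as reported in the text; the dictionary layout and function name are hypothetical, and men's Lab 2 science result (p < .051) is counted as calibrated because the text groups it with the significant differences.

```python
# True = the text reports a significant difference between confidence
# when correct and confidence when wrong for that course and domain.
CALIBRATED = {
    "women": {
        ("Lab 1", "science"): True,
        ("Lab 2", "science"): True,
        ("Lab 1", "statistics"): False,
        ("Lab 2", "statistics"): True,
        ("Lab 1", "experimental design"): False,
        ("Lab 2", "experimental design"): True,
        ("Lab 1", "computation"): True,
        ("Lab 2", "computation"): True,
    },
    "men": {
        ("Lab 1", "science"): True,
        ("Lab 2", "science"): True,  # p < .051, treated as significant in the text
        ("Lab 1", "statistics"): False,
        ("Lab 2", "statistics"): False,
        ("Lab 1", "experimental design"): True,
        ("Lab 2", "experimental design"): False,
        ("Lab 1", "computation"): False,
        ("Lab 2", "computation"): True,
    },
}


def calibration_rate(flags):
    """Proportion of course-by-domain cases showing calibration."""
    return sum(flags.values()) / len(flags)


for gender, flags in CALIBRATED.items():
    print(gender, calibration_rate(flags))
```

On this tally women are calibrated in six of eight cases and men in four of eight.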
Insert Figure 1 about here 



Women's confidence when correct on such items was significantly higher than when they 
were incorrect (t(6) = 7.7, p < .001); men, however, showed no discernible 
discrimination. In Lab 2, men were better at discriminating correct from incorrect 
responses (t(5) = 2.8, p < .04); women consistently showed this metacognitive ability 
(t(5) = 5.2, p < .003). In general, when men were wrong, their confidence was close to 4 on the 
5 point scale ("reasonably certain"), whereas when women were wrong, their confidence 
was closer to "mixed feelings of confidence and uncertainty". Overall then, although both 
groups were overconfident, men were consistently more confident than they should have 
been when they were wrong. Moreover, women showed much greater tendency to 
calibrate their confidence than did men. 
Quartile Differences in Calibration of Confidence 

An examination of confidence estimations of students in the top and fourth quartiles of 
the lab classes also reveals several interesting gender differences. 

Insert Figure 2 about here 

Figure 2 illustrates the calibration of confidence by women and men in the top and fourth 
quartiles of Lab 1 and Lab 2. Although men and women in the top quartiles in Lab 1 are equally 
confident in incorrect answers, men in the top quartile in Lab 2 and in the fourth quartile in 
both courses are much more confident when wrong than women are. Indeed, men in the 
lower quartile of Lab 1 are so overconfident that their mean confidence in incorrect answers 
was higher than the mean confidence of women in that quartile in their correct answers! 
The confidence of men in the fourth quartile in Lab 1 when they are wrong is higher than 
their confidence when correct, and approximately equal to the confidence when wrong of 
men in the top quartile. In Lab 2, men in the top quartile show little awareness of wrong 
answers, and are "very certain" they are correct when they are wrong. Women, however, 
are more aware of incorrect answers, and show greater ability to calibrate their confidence 
than do men. 
Confidence of Undergraduate and Graduate Women and Men 

The memory course included graduate and upper division undergraduate students, and 
we were able to compare the confidence of women and men at these levels. Table 3 shows 
the mean confidence of graduate and undergraduate women and men, as well as their 
overall accuracy on exam items. 



Insert Table 3 about here 



As Table 3 illustrates, both men and women at both levels are overconfident in the accuracy 
of their answers, with confidence estimates ranging from 9 to 13 points higher than the 
actual percent correct. The trend observed in the lower division lab courses (men's 
confidence higher than women's confidence) was not replicated here; in fact, just the 
opposite trend occurred. At both levels, women's confidence was slightly higher than 
men's, with one exception: When wrong, men gave higher confidence estimates, 
especially undergraduate men. Undergraduate men's overall confidence when correct 
(78%) was not significantly different from their confidence when they were incorrect (75%). 
Undergraduate men in this course were thus quite inappropriately confident even when they 
were wrong! 
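
Two simple quantities summarize the pattern in Table 3: overconfidence (mean confidence minus percent correct) and discrimination (confidence when correct minus confidence when wrong). The sketch below computes both from the Table 3 means as printed. The variable and function names are hypothetical, and the calculation is offered as an illustration rather than as the analysis the authors performed.

```python
# Means transcribed from Table 3 (percent scale).
TABLE_3 = {
    ("undergraduate", "women"): {"accuracy": 66, "confidence": 79.1,
                                 "cf_correct": 81.8, "cf_wrong": 73.8},
    ("undergraduate", "men"): {"accuracy": 66, "confidence": 76.3,
                               "cf_correct": 78.7, "cf_wrong": 75.4},
    ("graduate", "women"): {"accuracy": 78, "confidence": 90.1,
                            "cf_correct": 92.1, "cf_wrong": 82.3},
    ("graduate", "men"): {"accuracy": 72, "confidence": 82.2,
                          "cf_correct": 85.4, "cf_wrong": 70.3},
}


def overconfidence(row):
    """Mean confidence minus percent correct, in percentage points."""
    return round(row["confidence"] - row["accuracy"], 1)


def discrimination(row):
    """Confidence when correct minus confidence when wrong."""
    return round(row["cf_correct"] - row["cf_wrong"], 1)


for group, row in TABLE_3.items():
    print(group, overconfidence(row), discrimination(row))
```

The discrimination value is smallest for undergraduate men, which matches the observation that their confidence barely changed between correct and incorrect answers.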

Discussion 

In general, we found scant evidence to support the notion that women lack confidence; 
any such finding here must be qualified by the particular course involved and by the 
domain-specific nature of the examination items. Both women and men (but especially 
undergraduate men) were more confident than warranted in the accuracy of their answers. 
Apparently, women and men give very different confidence scores when prospectively 
estimating general feelings of confidence than they do in estimating their confidence in the 
accuracy of their answers to specific items. Among other things, this finding raises 
questions about generalizing from people predicting their confidence on a task they have 
not yet tried to the confidence they feel after answering specific questions. 

An important finding is that item-specific gender differences in confidence are dependent 
on the content of questions asked. In certain domains, such as mathematics, men were 
more confident than were women, while in other domains (e.g., learning and memory, 
experimental design), no such difference was observed. These results are consistent with 
findings that gender differences in performance on achievement tests in math (Hyde, 
Fennema & Lamon, 1990; Linn & Petersen, 1986) and science (Linn & Petersen, 1986) 
are dependent on the content (e.g., biology or physical science) or type (e.g., computation 
or problem solving) of item. Moreover, research on sex differences in elementary 
children's causal attributions for academic success or failure also supports content-specific 
confidence: sex differences were found in social studies and science but not in reading, 
language or math (Licht, 1987). In a meta-analysis of sex differences in causal 
attributions, there were no gender differences in attributions overall: differences were 
dependent on context and task influences (Whitley, McHugh & Frieze, 1986). Thus, these 
content influences in confidence, achievement, and attributions provide strong support for 
an interactionist theory of gender differences. As Linn (1986) noted, "Far from being well 
established and straightforward, gender differences are responsive to a large range of 
situational factors and background knowledge" (p. 221). 

Furthermore, on certain types of items (i.e., computation), we found that women were 
better than men at calibrating their confidence. Female superiority at calibration of 
confidence is consistent with results on sex differences in the confidence of younger 
children. Both 6-8 and 9-11 year old girls were more aware than boys that their answers to 
difficult items might be wrong. Older girls showed even greater discrimination between 
confidence when correct and confidence when wrong than younger girls, whereas boys' 
level of certainty in wrong answers remained the same regardless of age (Pressley, Levin, 
Ghatala & Ahmad, 1987). This gender difference of male overconfidence, especially on 
hard items, was replicated in a later study with first and second graders, though not with 
fourth and fifth grade children (Pressley & Ghatala, 1989). Calibration of confidence is an 
important aspect of metacognition. Certainly, knowing what one knows and what one 
doesn't know has important implications for study behaviors. Future studies might well 
examine how to help students better calibrate their confidence judgements. A start along 
this line has recently been made by LeCount & Fox (1992). 

In sum, the present investigation suggests that the problem may not be that women 
necessarily lack confidence, but that in some cases men have too much confidence, 
especially when they are wrong! The typical perception of women's lack of confidence, 
rather than men's overconfidence, may be the result of comparing prospective general 
confidence rather than retrospective and task or item-specific confidence. In this study, 
unlike many situations in life, we were able to use an objective standard (accuracy of 
answer) to judge confidence which eliminates the problem of using men's level of 
confidence as the norm (Roberts, 1991) or, for that matter, women's. Using this standard 
highlights limitations of using strictly male behavior as normative. Clearly, being 
overconfident when wrong may not be a very desirable trait in most situations. As American 
humorist Josh Billings put it over a hundred years ago: 

"It's not what a man don't know that makes him a fool, but what he does know that ain't 
so" (1974, p. ). Indeed, a growing recognition of this tendency toward male 
overconfidence in wrong answers is reflected in a recent popular magazine article, which 
labeled it "the male answer syndrome" and attempted to explain "why men always have 
opinions even on subjects they know nothing about" (Campbell, 1992, p. 107). Perhaps the 
question that should be pursued is not why women are less confident than men, but why in 
our culture we consider it aberrant behavior to recognize and admit uncertainty. 

Note. Asterisks represent significance at .05 or greater levels. Confidence 
numbers for Lab 1 and Lab 2 represent subjects' estimate that their answer 
is correct, based on a scale of 1-5 with 1 = pure guess and 5 = very certain. 
Confidence numbers for Memory course represent subjects' estimate that 
their answer was correct, based on a scale of 50%-100% for true/false 
items, and 25%-100% scale for multiple choice items. 

                          Lab 1                Lab 2
                      Women     Men        Women     Men

  ...                  ...      4.47*      3.75      4.15*
  S.D.                (.09)     (.52)      (.30)     (.48)

Experimental Design
  CF Correct           3.99     4.24       4.06      4.27
  S.D.                (.40)     (.36)      (.41)     (.26)
  CF Wrong             3.79     3.64       3.54      3.88*
  S.D.                (.41)     (.50)      (.3T      (.47)

Statistics
  CF Correct           3.90     4.20       3.90      4.02
  S.D.                (.54)     (.14)      (.33)     (.40)
  CF Wrong             3.39     4.01*      3.18      3.99*
  S.D.                (.37)     (.25)      (.61)     (.60)

Note. Asterisks represent significant differences at .05 or greater levels 
between male and female confidence. Confidence scores for Lab 1 and Lab 
2 represent subjects' estimate that their answer is correct, based on a scale 
of 1-5 with 1 = pure guess and 5 = very certain. 

Table 3 Mean Confidence of Undergraduate and Graduate 
Women and Men 

                    Undergraduate            Graduate
                   Women      Men        Women      Men

Accuracy            66        66          78        72
Confidence          79.1      76.3        90.1      82.2
  S.D.             (7.9)     (8.8)       (6.1)     (7.8)
CF Correct          81.8      78.7        92.1      85.4
  S.D.             (9.9)    (10.6)       (5.8)     (8.4)
CF Wrong            73.8      75.4        82.3      70.3
  S.D.            (11.7)    (14.7)      (19.3)    (17.8)

Note. Accuracy numbers represent the percentage correct. Confidence 
numbers represent subjects' estimate that their answer is correct, based on 
a scale of 50%-100% for true/false items and 25%-100% scale for multiple 
choice items. 

Figure 1. Lab 1 mean confidence of women and men on computational skills items. 
(Plotted series: confidence when correct and confidence when wrong, for women and for 
men. Horizontal axis: Computational Skills Items.) 

Figure 2. Mean confidence ratings from the upper and lower quartiles 
of women and men in Lab 1 and Lab 2. 
(Horizontal axis: Quartiles, grouped by Lab 1 and Lab 2.) 






